Minimum length RNA folding trajectories

Authors

  • Amir H. Bayegan
  • Peter Clote
Abstract

Background: Existent programs for RNA folding kinetics, such as Kinefold, Kinfold and KFOLD, implement the Gillespie algorithm to generate stochastic folding trajectories from an initial structure s to a target structure t, in which each intermediate secondary structure is obtained from its predecessor by the application of a move from a given move set. The Kinfold move set MS1 [resp. MS2] allows the addition or removal [resp. addition, removal or shift] of a single base pair. Define the MS1 [resp. MS2] distance between secondary structures s and t to be the minimum path length to refold s to t, where a move from MS1 [resp. MS2] is applied in each step. The MS1 distance between s and t is trivially equal to the cardinality of the symmetric difference of s and t, i.e. the number of base pairs belonging to one structure but not the other; in contrast, the computation of MS2 distance is highly non-trivial.

Results: We describe algorithms to compute the shortest MS2 folding trajectory between any two given RNA secondary structures. These algorithms include an optimal integer programming (IP) algorithm, an accurate and efficient near-optimal algorithm, a greedy algorithm, a branch-and-bound algorithm, and an optimal algorithm if one allows intermediate structures to contain pseudoknots. A 10-fold slower version of our IP algorithm appeared in WABI 2017; the current version exploits special treatment of closed 2-cycles.

Our optimal IP [resp. near-optimal IP] algorithm maximizes [resp. approximately maximizes] the number of shifts and minimizes [resp. approximately minimizes] the number of base pair additions and removals by applying integer programming to (essentially) solve the minimum feedback vertex set (FVS) problem for the RNA conflict digraph, then applies topological sort to tether subtrajectories into the final optimal folding trajectory.

We prove NP-hardness of the problem to determine the minimum barrier energy over all possible MS2 folding pathways, and conjecture that computing the MS2 distance between arbitrary secondary structures is NP-hard. Since our optimal IP algorithm relies on the FVS, known to be NP-complete for arbitrary digraphs, we compare the family of RNA conflict digraphs with the following classes of digraphs – planar, reducible flow graph, Eulerian, and tournament – for which FVS is known to be either polynomial time computable or NP-hard.

Conclusion: This paper describes a number of optimal and near-optimal algorithms to compute the shortest MS2 folding trajectory between any two secondary structures. Source code for our algorithms is available at http://bioinformatics.bc.edu/clotelab/MS2distance/.
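The MS1 distance stated in the Background can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`base_pairs`, `ms1_distance`, `ms1_trajectory`) and the dot-bracket input format are assumptions made for illustration. The sketch computes the distance as the size of the symmetric difference of the two base pair sets, and builds one shortest MS1 trajectory by first removing the pairs found only in s and then adding the pairs found only in t.

```python
def base_pairs(dot_bracket: str) -> frozenset:
    """Parse a pseudoknot-free dot-bracket string into a set of (i, j) pairs (0-based)."""
    stack = []
    pairs = set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % j)
            pairs.add((stack.pop(), j))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return frozenset(pairs)


def ms1_distance(s: str, t: str) -> int:
    """MS1 distance: number of base pairs in exactly one of the two structures."""
    return len(base_pairs(s) ^ base_pairs(t))


def ms1_trajectory(s: str, t: str) -> list:
    """One shortest MS1 path from s to t, as a list of base pair sets.

    Removals come first, so every intermediate is a subset of s or of t
    and therefore contains no crossing or conflicting pairs.
    """
    ps, pt = base_pairs(s), base_pairs(t)
    current = set(ps)
    path = [frozenset(current)]
    for pair in sorted(ps - pt):   # remove pairs that occur only in s
        current.remove(pair)
        path.append(frozenset(current))
    for pair in sorted(pt - ps):   # add pairs that occur only in t
        current.add(pair)
        path.append(frozenset(current))
    return path


if __name__ == "__main__":
    s, t = "(...)....", "(.......)"
    print(ms1_distance(s, t))          # 2
    for step in ms1_trajectory(s, t):
        print(sorted(step))
```

In this toy example the MS1 distance is 2 (one removal, one addition), whereas a single shift move that keeps position 0 fixed would presumably refold s to t in one MS2 step; that gap is what makes the MS2 distance harder to compute.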

Similar resources

Folding kinetics of large RNAs.

We introduce here a heuristic approach to kinetic RNA folding that constructs secondary structures by stepwise combination of building blocks. These blocks correspond to subsequences and their thermodynamically optimal structures. These are determined by the standard dynamic programming approach to RNA folding. Folding trajectories are modeled at base-pair resolution using the Morgan-Higgs heur...


Structural RNA has lower folding energy than random RNA of the same dinucleotide frequency.

We present results of computer experiments that indicate that several RNAs for which the native state (minimum free energy secondary structure) is functionally important (type III hammerhead ribozymes, signal recognition particle RNAs, U2 small nucleolar spliceosomal RNAs, certain riboswitches, etc.) all have lower folding energy than random RNAs of the same length and dinucleotide frequency. A...


Basin Hopping Graph: a computational framework to characterize RNA folding landscapes

MOTIVATION RNA folding is a complicated kinetic process. The minimum free energy structure provides only a static view of the most stable conformational state of the system. It is insufficient to give detailed insights into the dynamic behavior of RNAs. A sufficiently sophisticated analysis of the folding free energy landscape, however, can provide the relevant information. RESULTS We introdu...


Direct observation of hierarchical folding in single riboswitch aptamers.

Riboswitches regulate genes through structural changes in ligand-binding RNA aptamers. With the use of an optical-trapping assay based on in situ transcription by a molecule of RNA polymerase, single nascent RNAs containing pbuE adenine riboswitch aptamers were unfolded and refolded. Multiple folding states were characterized by means of both force-extension curves and folding trajectories unde...


Transition path times for nucleic Acid folding determined from energy-landscape analysis of single-molecule trajectories.

The duration of structural transitions in biopolymers is only a fraction of the time spent searching diffusively over the configurational energy landscape. We found the transition time, τ(TP), and the diffusion constant, D, for DNA and RNA folding using energy landscapes obtained from single-molecule trajectories under tension in optical traps. DNA hairpins, RNA pseudoknots, and a riboswitch al...



Journal:
  • CoRR

Volume: abs/1802.06328

Publication date: 2018